- ART-CEE
- USER'S GUIDE
-
- ART-CEE is a generic inference engine. It was written in MIX C on an
- Apricot F1, a run-of-the-mill MS-DOS machine. The source and load modules
- have been successfully compiled and executed on IBM XT, AT and clone machines
- and should port easily to any MS-DOS machine. The source code also should
- be fairly easy to convert to other C dialects.
-
- Inference engines are a category of artificial intelligence programs.
- Like traditional databases, they structure information for later retrieval.
- But they also can guide the user through solving problems, point out
- unnoticed connections within the data and suggest new possibilities. They
- are capable of working with certainties as well as 'fuzzy' information
- which is incomplete or tentative.
-
- Information is entered into inference engines as rules. In ART-CEE's
- case, these rules are "If...then" propositions. Such rules should be closely
- related and basic to the body of knowledge concerned; the quality of the
- system built with the inference engine depends most on the accuracy and ade-
- quacy of these rules. The user should enter enough rules to draw the major
- connections between the data items, but there is no need to enter everything
- that there is to know--after all, inference engines are designed to draw
- their own inferences.
-
- Information is retrieved from inference engines through queries. A query
- is a question or a command which causes the inference engine to search the
- database, extract the information it knows immediately and/or draw the
- necessary inferences to come as close as possible to meeting the request, and
- display both its conclusions and how it arrived at them. ART-CEE offers three
- kinds of queries, which together provide a broad range of services.
-
- Each rule in ART-CEE is associated with a percentage of truth, occurrence
- or certainty. If the rule is always true its percentage is 100; if it is
- never true its percentage is 0. ART-CEE also uses this number to mark some
- rules as logically impossible and to prevent the entering of tautologies.
- These percentages are used to control the query process.
-
-
- GETTING STARTED
- Before you do anything, make a copy of all ART-CEE files and hide that
- copy from neighbors, friends, angry spouses, magnetic fields, dust and
- nuclear explosions.
-
- ART-CEE is delivered as four source files. Compile each source file
- individually. If you are using the MIX system, you may wish to optimize
- WORKUP2.C for speed. Then link the four object files together and name the
- output file ART-CEE.COM. The program will run faster if you link the runtime
- functions with the WORKUP objects, but this will increase the size of the
- final .COM file to above 60K.
-
- If you find that ART-CEE's space requirements exceed what is available,
- the amount of 'stack' and 'heap' used can be affected greatly by adjusting
- the value of MAX in the header files. ART-CEE is supplied with a value of 60
- for that variable. All source files must carry exactly the same value for MAX.
-
- Locate the following files in the same directory on the default drive:
- ART-CEE.COM, HELP1.AIH, HELP2.AIH, HELP3.AIH, HELP4.AIH. If you chose not to
- link the MIX runtime functions with the WORKUP objects, the MIX C library
- functions also must reside in that directory, unless a PATH command has been
- executed. The runtime module also must have access to the MS-DOS i-o routines
- on the system disk; screen blanking will not succeed if these routines are
- not available on the default drive or through a PATH command.
-
-
- MAIN MENU AND PROMPT
- The main input screen displays a menu of all available input options.
-
- Commands and default settings may be addressed by entering the single
- letter in the appropriate menu phrase. For instance, to load a data file,
- enter "L". Usually the letter to be entered is the first letter in the
- phrase, but in any case it is the only capitalized letter. Do not enter
- the full word or phrase; ART-CEE will try to interpret that entry as a rule
- or subject. ART-CEE will convert your inputs to all-capitals; the program
- will execute more quickly if you enter them as capitals in the first place.
- All inputs to ART-CEE must end in the "Return" or "Enter" key. Beginning with
- version 1.4, inputs are processed without regard to case.
-
- Rules and queries demand a more involved input. Any input at the main
- prompt that is longer than one character will be treated as a rule or query.
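
The split between commands and longer inputs can be sketched in C as follows. This is only an illustration of the rule stated above, not ART-CEE's own source; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical categories; the names are illustrative, not ART-CEE's. */
enum input_kind { INPUT_EMPTY, INPUT_COMMAND, INPUT_RULE_OR_QUERY };

/* Classify one line typed at the main prompt (newline already removed). */
enum input_kind classify_input(const char *line)
{
    size_t len;

    assert(line != NULL);
    len = strlen(line);
    if (len == 0)
        return INPUT_EMPTY;
    if (len == 1)
        return INPUT_COMMAND;       /* a single letter selects a menu item */
    return INPUT_RULE_OR_QUERY;     /* anything longer is a rule or query */
}
```

Under this sketch, "L" is treated as the load command, while "LOAD" would be handed to the rule and query parser.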
-
-
- RULES
- All rules must begin with the word "IF " and contain the word " THEN ".
- Between these words must be a subject, and after the " THEN " must be a
- predicate (apologies to English teachers who know that such phrases are not
- grammatical subjects and predicates). The entire rule cannot be longer than
- eighty characters. Neither subject nor predicate should contain punctuation.
- It follows that the rule should not end in punctuation (the reason for this is
- that later matching against the predicate would have to include the same
- punctuation, which probably would be incorrect for that usage and a mess to
- remember). Any leading grammatical article ("A ", "AN " or "THE ") in sub-
- ject or predicate will be ignored.
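
The parsing rules above can be sketched in C roughly as follows. This is an assumption-laden illustration, not the code ART-CEE actually uses: the names RULE_MAX_LEN, skip_article and parse_rule are hypothetical, and the sketch does not check for punctuation.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define RULE_MAX_LEN 80   /* length limit stated in the guide */

/* Skip one leading article ("A ", "AN " or "THE "), if present. */
static const char *skip_article(const char *s)
{
    static const char *articles[] = { "A ", "AN ", "THE " };
    size_t i;

    for (i = 0; i < 3; i++) {
        size_t n = strlen(articles[i]);
        if (strncmp(s, articles[i], n) == 0)
            return s + n;
    }
    return s;
}

/* Split "IF <subject> THEN <predicate>" into its two parts.
 * Both output buffers must hold at least RULE_MAX_LEN + 1 bytes.
 * Returns 1 on success, 0 if the line is not a well-formed rule. */
int parse_rule(const char *line, char *subject, char *predicate)
{
    char buf[RULE_MAX_LEN + 1];
    char *then;
    const char *s, *p;
    size_t i, len;

    assert(line != NULL && subject != NULL && predicate != NULL);
    len = strlen(line);
    if (len > RULE_MAX_LEN)
        return 0;
    for (i = 0; i < len; i++)               /* fold input to capitals */
        buf[i] = (char)toupper((unsigned char)line[i]);
    buf[len] = '\0';

    if (strncmp(buf, "IF ", 3) != 0)
        return 0;
    then = strstr(buf + 3, " THEN ");
    if (then == NULL)
        return 0;
    *then = '\0';                           /* terminate the subject text */
    s = skip_article(buf + 3);
    p = skip_article(then + 6);
    if (*s == '\0' || *p == '\0')
        return 0;
    strcpy(subject, s);
    strcpy(predicate, p);
    return 1;
}
```

Given "if the collie then a dog", this sketch yields the subject COLLIE and the predicate DOG.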
-
- ART-CEE searches the database on each rule input to determine if it already
- knows the subject or predicate. If it finds that the rule already exists, you
- are given the opportunity to enter new percentages of occurrence for that rule.
- Otherwise, if there is room in the database for the new rule, the rule is
- added to the knowledge base. If override of default percentages is turned on,
- you are asked to input forward and reverse percents. Otherwise the defaults
- are used.
-
- ART-CEE can handle up to MAX number of different subjects and/or predi-
- cates. Internally, subjects and predicates are stored identically, as the
- subject of one rule likely will be the predicate of another. The maximum
- number of rules that can be handled is MAX * (MAX -1).
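
As a quick check of the capacity formula, the arithmetic can be written out in C. The function name is hypothetical. With the supplied MAX of 60, the limit works out to 60 * 59 = 3540 rules.

```c
#include <assert.h>

/* Largest number of distinct rules for a given MAX: every ordered pair of
 * two different subjects, that is MAX * (MAX - 1). Pairing a subject with
 * itself (a tautology) is not counted. */
long max_rules(long max_subjects)
{
    assert(max_subjects >= 1);
    return max_subjects * (max_subjects - 1);
}
```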
-
- A forward percentage of occurrence refers to the percent of time that
- the rule is true as entered. If fifty percent of all humans are female, then
- the rule "IF HUMAN THEN FEMALE" would have a forward percentage of 50.00000
- (trailing zeroes need not be entered). Reverse percentages refer to the
- percent of time that the rule is true in reverse format. Using the example
- above, if one percent of all females are human, the reverse percentage for
- "IF HUMAN THEN FEMALE" would be 1.000000. Percentages must be not less than
- zero (zero means "never true") and not greater than one hundred (one hundred
- means "always true"). ART-CEE stores impossible rules with negative percentages
- of occurrence, but you cannot enter them at rule entry time (see commands B
- and G).
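
One way to picture forward and reverse percentages is as two cells of a MAX by MAX table. The sketch below assumes such a table, with the reverse percentage stored as the forward percentage of the reversed rule. The guide does not confirm this layout, and the names pct, enter_rule and rule_percent are hypothetical.

```c
#include <assert.h>

#define MAX 60   /* the value ART-CEE is supplied with */

/* Assumed layout: pct[s][p] is the percent of time "IF s THEN p" holds. */
static float pct[MAX][MAX];

/* Store a rule's forward and reverse percentages.
 * Returns 1 on success, 0 if an index or a percentage is out of range
 * or the rule is a tautology (subject equals predicate). */
int enter_rule(int subj, int pred, float forward, float reverse)
{
    if (subj < 0 || subj >= MAX || pred < 0 || pred >= MAX || subj == pred)
        return 0;
    if (forward < 0.0f || forward > 100.0f)
        return 0;
    if (reverse < 0.0f || reverse > 100.0f)
        return 0;
    pct[subj][pred] = forward;   /* IF subj THEN pred */
    pct[pred][subj] = reverse;   /* IF pred THEN subj */
    return 1;
}

/* Look up the stored percentage for "IF subj THEN pred". */
float rule_percent(int subj, int pred)
{
    assert(subj >= 0 && subj < MAX && pred >= 0 && pred < MAX);
    return pct[subj][pred];
}
```

With HUMAN in position 0 and FEMALE in position 1, enter_rule(0, 1, 50.0f, 1.0f) records the example above. Negative values are rejected here, matching the statement that impossible rules cannot be entered at rule entry time.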
-
-
- QUERIES
- Three query formats are available in ART-CEE: simple query, two-element
- query and thinking.
-
- The simple query is a request for all rules concerning just one subject
- in the database. It reports only those rules which contain the subject as
- subject or predicate; no inferences are drawn. Only positive rules are
- reported; if a potential rule has not been entered, or if the rule is marked
- as impossible, it is not reported. Simple queries are entered by entering
- any of the following phrases, followed by the subject: WHO, WHO IS, WHO IS A,
- WHO IS AN, WHO IS THE, WHAT, WHAT IS, WHAT IS A, WHAT IS AN, WHAT IS THE,
- DESCRIBE, DESCRIBE A, DESCRIBE AN, or DESCRIBE THE. The subject may be
- followed by a question mark, which is ignored. ART-CEE searches the database
- for an exact match on the subject. If the subject is found, all forward
- rules for that subject are reported to the computer monitor, followed by all
- reverse rules.
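
Recognizing a simple query amounts to stripping one of the listed phrases and an optional trailing question mark. The following C sketch shows one way to do that. It is not ART-CEE's parser; the function name is hypothetical, and the input is assumed to be in capitals already.

```c
#include <assert.h>
#include <string.h>

/* Query phrases from the guide. Longer phrases come before shorter ones
 * with the same stem, so "WHAT IS A " is tried before "WHAT ". */
static const char *prefixes[] = {
    "DESCRIBE THE ", "DESCRIBE AN ", "DESCRIBE A ", "DESCRIBE ",
    "WHAT IS THE ", "WHAT IS AN ", "WHAT IS A ", "WHAT IS ", "WHAT ",
    "WHO IS THE ", "WHO IS AN ", "WHO IS A ", "WHO IS ", "WHO "
};

/* Copy the subject of a simple query into out (capacity out_size),
 * dropping the leading phrase and one trailing '?'.
 * Returns 1 if the line is a simple query, 0 otherwise. */
int simple_query_subject(const char *line, char *out, size_t out_size)
{
    size_t i, n, len;

    assert(line != NULL && out != NULL);
    for (i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++) {
        n = strlen(prefixes[i]);
        if (strncmp(line, prefixes[i], n) != 0)
            continue;
        len = strlen(line + n);
        if (len > 0 && line[n + len - 1] == '?')
            len--;                  /* the question mark is ignored */
        if (len == 0 || len >= out_size)
            return 0;
        memcpy(out, line + n, len);
        out[len] = '\0';
        return 1;
    }
    return 0;
}
```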
-
- The two-element query asks if a certain rule is true. Using the above
- example, to find out what percentage of all females are human, the input
- would be: "IF FEMALE THEN HUMAN?" Note that the input is exactly like
- the input for the entering of the rule, except that the rule ends in a ques-
- tion mark. All parsing rules applicable for rules are applicable for two-
- element queries. If subject and predicate are in the database, the query
- begins. If the rule already exists in the database it is reported, and the
- query search ends. If the rule does not exist (i.e., is marked as having a
- zero percentage of occurrence), ART-CEE attempts to find any way possible to
- chain together enough inferences to draw a conclusion about the rule. For
- instance, suppose that ART-CEE does not know directly how many females are
- human, but it does know the following rules:
-      IF FEMALE THEN MAMMAL    20%
-      IF MAMMAL THEN HUMAN      3%
- ART-CEE will link these two rules together and conclude that the rule "IF
- FEMALE THEN HUMAN" is true 0.6% of the time (20% times 3%). It reports
- the chain that it used to draw this conclusion on the monitor and asks if
- you agree with each step in the chain. If you disagree with any step, the
- entire chain beginning with the part that you rejected is abandoned, and
- the search continues. If you agree with the entire chain, ART-CEE asks
- if you wish to add the new fact ("IF FEMALE THEN HUMAN" 0.6%) to the
- database and complies with your decision. The search resumes to find other ways
- that FEMALE and HUMAN can be linked until all possible links have been
- examined.
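
The arithmetic behind a chained conclusion is a product of the link percentages, each taken as a fraction of 100. A minimal C sketch follows, with a hypothetical function name. It illustrates the calculation described above and is not taken from ART-CEE's source.

```c
#include <assert.h>
#include <stddef.h>

/* Combined percentage for a chain of rules: multiply the individual
 * percentages as fractions of 100, then convert back to a percentage. */
double chain_percent(const double *link_pct, size_t n_links)
{
    double fraction = 1.0;
    size_t i;

    assert(link_pct != NULL && n_links > 0);
    for (i = 0; i < n_links; i++)
        fraction *= link_pct[i] / 100.0;
    return fraction * 100.0;
}
```

For the two links in the example, 20 and 3, the product is 0.20 * 0.03 = 0.006, or 0.6 percent. Adding links can only keep the result level or lower it, since every factor is at most 1.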
-
- The two-element query also can work with incomplete data. Sometimes it
- is possible to make a connection between two subjects only if one additional
- rule is assumed to be true. For instance, suppose that the following rules
- are known:
-      IF FEMALE THEN MAMMAL        20%
-      IF MAMMAL THEN TWO-LEGGED     3%
-      IF TAILLESS THEN HUMAN       10%
- ART-CEE cannot conclude IF FEMALE THEN HUMAN unless one of the following
- assumptions is made: FEMALEs are TAILLESS, MAMMALs are TAILLESS, TWO-LEGGED
- implies TAILLESS, MAMMALs are HUMAN or TWO-LEGGED implies HUMAN. By setting
- the default number of assumptions on the main menu (function A), the number
- of assumptions which will be included in each attempt to chain subjects to-
- gether is controlled. Changing the number of assumptions to 1 allows 1
- such assumption per attempt, etc. The maximum number of assumptions allowed
- in any one chain is MAX - 3. Increasing the number of allowed assumptions
- increases the power of the two-element query, but it also geometrically
- increases the effort both program and user must exert to get through the query.
- If the number of assumptions is nonzero, ART-CEE may request information from the
- user at what seem to be odd moments. When the user is asked to agree to the
- assumption as drawn, a "Y" response will result in the new rule's addition
- to the database, and the user will be asked for the percentage of occurrence
- for that rule, whether or not the override of defaults is on.
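
To see why the effort grows so quickly, it helps to count the chains that become available at each assumption setting. The C sketch below does this for the FEMALE/HUMAN example with a plain depth-first search. It is an illustration under stated assumptions, not ART-CEE's search routine: the names are hypothetical, any link that is not a stored rule counts as one assumption, and the direct link from the query's subject to its predicate is skipped because it is the rule being asked about.

```c
#include <assert.h>

#define N 5   /* subjects in the example; a small stand-in for MAX */

/* Count chains from 'at' to 'goal' that visit no subject twice and use at
 * most 'budget' assumed links. known[a][b] != 0 means "IF a THEN b" is a
 * stored rule; every other link costs one assumption. */
static int count_chains(int known[N][N], int start, int goal,
                        int at, int budget, int visited[N])
{
    int next, total = 0;

    if (at == goal)
        return 1;
    visited[at] = 1;
    for (next = 0; next < N; next++) {
        int cost;
        if (visited[next] || (at == start && next == goal))
            continue;               /* no loops, no direct start->goal link */
        cost = known[at][next] ? 0 : 1;
        if (cost <= budget)
            total += count_chains(known, start, goal, next,
                                  budget - cost, visited);
    }
    visited[at] = 0;                /* backtrack */
    return total;
}

/* Number of chains linking FEMALE to HUMAN in the guide's example. */
int example_chains(int assumptions)
{
    enum { FEMALE, MAMMAL, TWO_LEGGED, TAILLESS, HUMAN };
    int known[N][N] = { { 0 } };
    int visited[N] = { 0 };

    assert(assumptions >= 0);
    known[FEMALE][MAMMAL] = 1;
    known[MAMMAL][TWO_LEGGED] = 1;
    known[TAILLESS][HUMAN] = 1;
    return count_chains(known, FEMALE, HUMAN, FEMALE, assumptions, visited);
}
```

With no assumptions allowed the sketch finds no chain at all. With one assumption it finds five, one for each of the assumptions listed above.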
-
- The final query form, "thinking", is an automated extension of the two-
- element query. Every subject in the database is chained to every other sub-
- ject in the database. Before the query is executed, ART-CEE goes through the
- database marking all logical impossibilities. These impossibilities take
- the form of:
-      IF A THEN B cannot be true (i.e., IF A THEN B has a negative percentage)
-      IF C THEN B is always true (i.e., IF C THEN B has a percentage of 100)
-      Therefore, IF A THEN C can never be true.
- Then ART-CEE begins chaining. Only rules with positive percentages are in-
- cluded in the chains (i.e., no assumptions are drawn). All chains are extended
- as far as they will go, up to the think depth setting shown on the main menu.
- That is, if the think depth setting is 3, then no chain will be extended be-
- yond three subjects. The minimum depth setting is three, and the maximum
- setting is MAX - 1. While increasing the depth setting increases the prob-
- ability of finding all possible inferences, it also greatly increases the
- time necessary to perform the query. Seldom does a depth of more than four
- or five prove efficient. All connections drawn in the "thinking" function
- are applied to the database, and the greatest percentage found for each rule
- is the final one saved in the database. No impossible rules are changed.
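
The two bookkeeping steps of "thinking", marking impossibilities and keeping the greatest percentage, can be sketched in C as follows. The sketch assumes a square table of percentages and a marker value of -1; the guide says only that impossible rules carry a negative percentage. All names are hypothetical, and the marking function makes a single pass rather than repeating until nothing changes.

```c
#include <assert.h>

#define N 3                  /* tiny stand-in for MAX */
#define IMPOSSIBLE (-1.0)    /* hypothetical marker; the guide says "negative" */

/* If "IF a THEN b" is impossible and "IF c THEN b" is always true,
 * mark "IF a THEN c" impossible. Returns the number of rules newly marked. */
int mark_impossible(double pct[N][N])
{
    int a, b, c, marked = 0;

    for (a = 0; a < N; a++) {
        for (b = 0; b < N; b++) {
            if (a == b || pct[a][b] >= 0.0)
                continue;           /* only impossible rules start a match */
            for (c = 0; c < N; c++) {
                if (c == a || c == b)
                    continue;
                if (pct[c][b] == 100.0 && pct[a][c] >= 0.0) {
                    pct[a][c] = IMPOSSIBLE;
                    marked++;
                }
            }
        }
    }
    return marked;
}

/* Record an inferred percentage for "IF a THEN c", keeping the greatest
 * value seen and never altering a rule already marked impossible. */
void apply_inference(double pct[N][N], int a, int c, double inferred)
{
    assert(a >= 0 && a < N && c >= 0 && c < N);
    if (pct[a][c] >= 0.0 && inferred > pct[a][c])
        pct[a][c] = inferred;
}
```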
-
-
- COMMANDS
- A brief discussion of each command follows, each prefixed by the single-
- character entry that invokes the command.
-
- A Set number of assumptions that will be allowed in any single chain
- in the two-element query. Minimum number is 0; maximum is MAX - 3.
-
- B Enter a file containing mutually exclusive subjects, and mark all
- occurrences of those subjects in the database as mutually exclusive.
- For instance, assume that a file contains the following facts:
- DOG, CAT, PIG, HORSE, COW. Assume also that the database contains
- the subjects DOG, CAT, HORSE, COW. This command will mark the following
- rules as impossible:
-      IF DOG THEN CAT      IF DOG THEN HORSE    IF DOG THEN COW
-      IF CAT THEN DOG      IF CAT THEN HORSE    IF CAT THEN COW
-      IF HORSE THEN DOG    IF HORSE THEN CAT    IF HORSE THEN COW
-      IF COW THEN DOG      IF COW THEN HORSE    IF COW THEN CAT
- The PIG item in the file is ignored.
-
- C Change a subject without changing any percentages associated with it.
- For instance, the subject "PIG" could be changed to "SWINE". Then
- all rules that referenced "PIG" will now reference "SWINE".
-
- D Drop a rule. Suppose that the database contains the rule "IF COLLIE
- THEN DOG". By selecting this option, the rule can be marked as having
- a zero percentage of occurrence. If this is the only rule referencing
- COLLIE, then the subject COLLIE is erased. The same would occur to
- DOG if this were the only rule referencing DOG.
-
- F Set default forward percentage of occurrence. Valid values for this
- default are not less than zero and not greater than one hundred. This
- setting can be used to great advantage if a large number of similarly-
- occurring rules are to be entered at the same setting. Set the default
- appropriately, set the default reverse percentage as well, turn off
- the override option, and just enter the rules. All prompts for per-
- centage inputs will be skipped.
-
- G Enter a group of mutually exclusive subjects from the keyboard. The
- function works the same way as "B", except that the keyboard is the
- source of information. A single letter "E" ends the input stream and
- begins the marking of subjects. Any number of subjects can be en-
- tered, but only those actually found in the database will be stored.
- After the subjects are marked, you will be given opportunity to save
- the group for later use as an input file under option "B".
-
- H Help screens are available online. These screens are an abbreviated
- version of this user's guide. The function will work only if the
- help files are located in the default directory on the default drive
- or an MS-DOS PATH command was issued prior to entering ART-CEE.
-
- I Initialize the database. This function erases all subjects in the
- database and marks all percentages of occurrence as zero except for
- tautologies (IF A THEN A), which are marked as impossible.
-
- K Set depth of thinking chaining. See the discussion of thinking under
- queries, above.
-
- L Load a data file from disk. The database developed in a previous
- ART-CEE session can be reloaded using this function. If the file con-
- tains fewer than MAX subjects, the data in the file will overlay the
- contents of the database now occupying the positions from which the
- file was written and leave the remainder of the database untouched. If
- the highest-numbered subject in the file exceeds the current value of
- MAX, the file cannot be successfully loaded.
-
- M Toggle the showing of the main menu. If the current setting is "Y"
- (show the menu), choosing this function will change the setting to
- "N" (do not show the menu). Once the commands and query/rule format
- are well-known, significant time can be saved by switching the setting
- to "N".
-
- O Toggle the overriding of default percentages of occurrence. If the
- current setting is "Y" (yes), choosing this function will change the
- setting to "N" (no), and vice versa.
-
- P Print the database. All rules in the database will be copied to the
- printer (LPT1), together with their percentages of occurrence. These
- rules will be grouped by subject, all forward references first and
- then reverse references. Therefore, all rules will be printed
- twice. If all subjects are used and all subjects have a completely
- filled set of rules, the number of lines printed will be MAX * 2 *
- (MAX + 2).
-
- R Set the default percentage used for reverse references. Valid values
- are not less than zero and not greater than one hundred.
-
- S Save the database to disk. The database can be saved in whole or
- in part. To save just part of the database, enter the starting and
- ending positions when prompted (use function V to determine what
- subject is in what position). If the present database was loaded
- from disk, the filename from which the database was loaded will be
- offered to you as a default, otherwise the filename "ART-CEE.DAT" will
- be offered. The default can be overridden by entering any other
- name when prompted. If a file by the chosen name already exists, it
- will be overwritten without backup.
-
- T Think. See discussion under "QUERIES".
-
- V View the database. All used subjects are listed in order of entry,
- followed by the number of forward references and number of reverse
- references for each.
-
- X Exit the program. You will be asked if you wish to save the database
- before the exit occurs. The default is not to save the database.
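
The behavior of command D, where dropping a rule may also erase a subject, can be sketched in C. This sketch rests on several assumptions not confirmed by the guide: rules live in a square table of percentages, only rules with a positive percentage count as references, and a separate used[] array records which subject positions are occupied. All names are hypothetical.

```c
#include <assert.h>

#define N 4   /* tiny stand-in for MAX */

/* Does any rule with a positive percentage mention subject s? */
static int is_referenced(double pct[N][N], int s)
{
    int i;

    for (i = 0; i < N; i++)
        if (i != s && (pct[s][i] > 0.0 || pct[i][s] > 0.0))
            return 1;
    return 0;
}

/* Drop "IF subj THEN pred" by zeroing its percentage, then flag either
 * subject as unused when no remaining rule references it. */
void drop_rule(double pct[N][N], int used[N], int subj, int pred)
{
    assert(subj >= 0 && subj < N && pred >= 0 && pred < N);
    pct[subj][pred] = 0.0;
    if (!is_referenced(pct, subj))
        used[subj] = 0;
    if (!is_referenced(pct, pred))
        used[pred] = 0;
}
```

If COLLIE appears only in "IF COLLIE THEN DOG" while DOG also appears in another rule, dropping that rule clears COLLIE and leaves DOG in place.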
-
-
- ART-CEE AND DATA STRUCTURES
- Like all software, ART-CEE has to deal with four possible relationships
- within the data it handles. The simplest and least significant is the one-
- to-one relationship. An item in the database has just one connection with
- one other item in the database, and that's all. "IF A then B" defines such
- a relationship. If "A" is true then "B" is always true; if it were other-
- wise, the rule would be just one part of a larger set of truths, and "A"
- would connect with at least one more item besides "B".
-
- The one-to-many relationship involves multiple possibilities rising
- from the same item. "If A then B" is true part of the time, and "If A then
- C" also is true part of the time, but "B" and "C" cannot be true at the
- same time.
-
- The many-to-one relationship is created by such combinations as "If C
- then A" and "If B then A". Two items have the same kind of relationship
- with another item in the database; they lead to the same conclusion. This
- relationship is not the same as "If C then if B then A".
-
- Finally, the many-to-many relationship combines the one-to-many and
- many-to-one relationships. The following rules constitute a many-to-many
- relationship:
-      If A then B.
-      If A then C.
-      If C then B.
-      If B then C.
- The relationships between items intertwine.
-
- Expressing one-to-one relationships with ART-CEE is straightforward.
- Just enter a new rule. However, unless more rules are entered, transforming
- the one-to-one into one of the other types, the one-to-one relationship
- amounts to little more than a redefinition of the subject.
-
- One-to-many and many-to-one relationships are created by entering
- several rules having items in common on the 'many' side of the relationship.
- The grouping functions (commands 'B' and 'G') are then used to mark these
- common elements as having a mutually exclusive relationship with each
- other. The result is a hierarchy of information that can be diagrammed as
- a triangle:
-          A               B       C
-         / \               \     /
-        /   \      or       \   /
-       B     C                A
- Use of the grouping functions assures that all relationships in the
- above diagrams are vertical ("If B then C" and vice versa cannot be
- true without transforming the relationships into the many-to-many kind).
- This structure is appropriate for classification schemes, diagnosis patterns
- and family trees.
-
- Many-to-many relationships form a more nebulous pattern in which each
- element theoretically can rise from and lead to every other item. Such
- structures can be used to identify patterns in which statistics are known
- about the data but overall structure of the data is unclear. Many-to-
- many patterns can also be used to interrelate two or more hierarchies.
-
- The big factor with ART-CEE is that ART-CEE just loves to transform hier-
- archies (many-to-one and one-to-many relationships) into many-to-many forms.
- The drawing of inferences is the way that ART-CEE does this transformation.
- If you have built a hierarchical structure and want to keep it that way,
- be careful to do the following:
- 1. Do not use the 'think' function against your permanent database.
- 2. Do not add query findings to the database.
- 3. Do not use assumptions.
- 4. Use the 'grouping' functions completely.
- If you wish to break any of the above rules, be sure to save the database first,
- then do not save the database when exiting from ART-CEE. Of course, if you
- are working with a nebulous, highly interconnected database to begin with,
- feel free to infer to your heart's content.
-